89 research outputs found

Minimax risks for sparse regressions: Ultra-high-dimensional phenomenons

Author: Verzelen Nicolas
Publication date: 01/01/2012

Consider the standard Gaussian linear regression model $Y=X\theta_0+\epsilon$, where $Y\in \mathbb{R}^n$ is a response vector and $X\in \mathbb{R}^{n\times p}$ is a design matrix. Numerous works have been devoted to building efficient estimators of $\theta_0$ when $p$ is much larger than $n$. In such a situation, a classical approach amounts to assuming that $\theta_0$ is approximately sparse. This paper studies the minimax risks of estimation and testing over classes of $k$-sparse vectors $\theta_0$. These bounds shed light on the limitations due to high-dimensionality. The results encompass the problem of prediction (estimation of $X\theta_0$), the inverse problem (estimation of $\theta_0$) and linear testing (testing $X\theta_0=0$). Interestingly, an elbow effect occurs when the number of variables $k\log(p/k)$ becomes large compared to $n$. Indeed, the minimax risks and hypothesis separation distances blow up in this ultra-high dimensional setting. We also prove that even dimension reduction techniques cannot provide satisfying results in an ultra-high dimensional setting. Moreover, we compute the minimax risks when the variance of the noise is unknown. The knowledge of this variance is shown to play a significant role in the optimal rates of estimation and testing. All these minimax bounds provide a characterization of statistical problems that are so difficult that no procedure can provide satisfying results.
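As a rough illustration of the elbow effect described in this abstract, the following minimal sketch compares $k\log(p/k)$ with $n$. The function names and the `threshold` parameter are hypothetical and chosen for illustration only; the abstract says the regime begins when $k\log(p/k)$ is "large compared to" $n$ and does not give a constant.

```python
import math


def sparsity_complexity(k: int, p: int) -> float:
    """Return k * log(p / k), the quantity the abstract compares with n."""
    if not 0 < k <= p:
        raise ValueError("need 0 < k <= p")
    return k * math.log(p / k)


def is_ultra_high_dimensional(n: int, k: int, p: int, threshold: float = 1.0) -> bool:
    """Flag the regime where k * log(p / k) exceeds threshold * n.

    The threshold is an illustrative assumption, not a constant from the paper.
    """
    return sparsity_complexity(k, p) > threshold * n


if __name__ == "__main__":
    # Two illustrative settings: (n, k, p)
    for n, k, p in [(1000, 5, 10_000), (50, 20, 10_000)]:
        c = sparsity_complexity(k, p)
        print(f"n={n}, k={k}, p={p}: k*log(p/k)={c:.1f}, "
              f"ultra-high-dimensional={is_ultra_high_dimensional(n, k, p)}")
```

With the default threshold, the first setting falls in the ordinary high-dimensional regime and the second in the ultra-high-dimensional one.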

arXiv.org e-Print Archive

Crossref

ProdInra

Technical appendix to "Adaptive estimation of stationary Gaussian fields"

Author: Verzelen Nicolas
Publication date: 30/08/2009

This is a technical appendix to "Adaptive estimation of stationary Gaussian fields". We present several proofs that have been skipped in the main paper. Comment: 28 pages

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

Adaptive estimation of covariance matrices via Cholesky decomposition

Author: Verzelen Nicolas
Publication date: 01/01/2010

This paper studies the estimation of a large covariance matrix. We introduce a novel procedure called ChoSelect based on the Cholesky factor of the inverse covariance. This method uses a dimension reduction strategy by selecting the pattern of zeros of the Cholesky factor. Alternatively, ChoSelect can be interpreted as a graph estimation procedure for directed Gaussian graphical models. Our approach is particularly relevant when the variables under study have a natural ordering (e.g. time series) or more generally when the Cholesky factor is approximately sparse. ChoSelect achieves non-asymptotic oracle inequalities with respect to the Kullback-Leibler entropy. Moreover, it satisfies various adaptive properties from a minimax point of view. We also introduce and study a two-stage procedure that combines ChoSelect with the Lasso. This last method enables practitioners to choose their own trade-off between statistical efficiency and computational complexity. Moreover, it is consistent under weaker assumptions than the Lasso. The practical performance of the different procedures is assessed on numerical examples.
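To illustrate why a natural ordering such as a time series can give a sparse Cholesky factor of the inverse covariance, here is a minimal NumPy sketch on an AR(1)-type covariance. This is not the ChoSelect procedure: it only inspects the zero pattern of a known precision matrix, and it uses NumPy's standard lower-triangular factorization, which may differ from the paper's convention. All function names and the tolerance are assumptions made for the example.

```python
import numpy as np


def ar1_covariance(p: int, rho: float) -> np.ndarray:
    """Covariance with entries rho**|i - j| (stationary AR(1)-type structure)."""
    idx = np.arange(p)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def cholesky_of_precision(sigma: np.ndarray) -> np.ndarray:
    """Lower-triangular L with inv(sigma) = L @ L.T."""
    omega = np.linalg.inv(sigma)
    return np.linalg.cholesky(omega)


def count_nonzeros(mat: np.ndarray, tol: float = 1e-8) -> int:
    """Count entries whose magnitude exceeds a numerical tolerance."""
    return int(np.sum(np.abs(mat) > tol))


p, rho = 6, 0.5
L = cholesky_of_precision(ar1_covariance(p, rho))
print(count_nonzeros(L), "nonzero entries out of", p * (p + 1) // 2,
      "in the lower triangle")
```

For this covariance the inverse is tridiagonal, so the factor has nonzero entries only on the diagonal and the first subdiagonal.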

arXiv.org e-Print Archive

ProdInra

High-dimensional Gaussian model selection on a Gaussian design

Author: Verzelen Nicolas
Publication date: 01/01/2008

We consider the problem of estimating the conditional mean of a real Gaussian variable $Y=\sum_{i=1}^p\theta_iX_i+\epsilon$, where the vector of the covariates $(X_i)_{1\leq i\leq p}$ follows a joint Gaussian distribution. This issue often occurs when one aims at estimating the graph or the distribution of a Gaussian graphical model. We introduce a general model selection procedure which is based on the minimization of a penalized least-squares type criterion. It handles a variety of problems such as ordered and complete variable selection, allows incorporating prior knowledge on the model, and applies when the number of covariates $p$ is larger than the number of observations $n$. Moreover, it is shown to achieve a non-asymptotic oracle inequality independently of the correlation structure of the covariates. We also exhibit various minimax rates of estimation in the considered framework and hence derive adaptiveness properties of our procedure.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Numérisation de Documents Anciens Mathématiques

Hal-Diderot

Adaptive estimation of High-Dimensional Signal-to-Noise Ratios

Authors: Gassiat Elisabeth, Verzelen Nicolas
Publication date: 16/03/2017

We consider the equivalent problems of estimating the residual variance, the proportion of explained variance $\eta$ and the signal strength in a high-dimensional linear regression model with Gaussian random design. Our aim is to understand the impact of not knowing the sparsity of the regression parameter and not knowing the distribution of the design on minimax estimation rates of $\eta$. Depending on the sparsity $k$ of the regression parameter, optimal estimators of $\eta$ either rely on estimating the regression parameter or are based on U-type statistics, and have minimax rates depending on $k$. In the important situation where $k$ is unknown, we build an adaptive procedure whose convergence rate simultaneously achieves the minimax risk over all $k$ up to a logarithmic loss, which we prove to be unavoidable. Finally, the knowledge of the design distribution is shown to play a critical role. When the distribution of the design is unknown, consistent estimation of explained variance is indeed possible in much narrower regimes than for known design distribution.

arXiv.org e-Print Archive

ProdInra

Partial recovery bounds for clustering with the relaxed $K$ means

Authors: Giraud Christophe, Verzelen Nicolas
Publication date: 01/01/2018

We investigate the clustering performance of the relaxed $K$ means in the setting of sub-Gaussian Mixture Model (sGMM) and Stochastic Block Model (SBM). After identifying the appropriate signal-to-noise ratio (SNR), we prove that the misclassification error decays exponentially fast with respect to this SNR. These partial recovery bounds for the relaxed $K$ means improve upon results currently known in the sGMM setting. In the SBM setting, applying the relaxed $K$ means SDP allows handling general connection probabilities, whereas other SDPs investigated in the literature are restricted to the assortative case (where within-group probabilities are larger than between-group probabilities). Again, this partial recovery bound complements the state-of-the-art results. Altogether, these results put forward the versatility of the relaxed $K$ means. Comment: 39 pages

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Detection and Feature Selection in Sparse Mixture Models

Authors: Arias-Castro Ery, Verzelen Nicolas
Publication date: 01/10/2016

We consider Gaussian mixture models in high dimensions and concentrate on the twin tasks of detection and feature selection. Under sparsity assumptions on the difference in means, we derive information bounds and establish the performance of various procedures, including the top sparse eigenvalue of the sample covariance matrix and other projection tests based on moments, such as the skewness and kurtosis tests of Malkovich and Afifi (1973), and other variants which we were better able to control under the null. Comment: 70 pages

arXiv.org e-Print Archive

CiteSeerX

HAL Descartes

Hal-Diderot

Optimal graphon estimation in cut distance

Authors: Klopp Olga, Verzelen Nicolas
Publication date: 16/10/2018

Consider the twin problems of estimating the connection probability matrix of an inhomogeneous random graph and the graphon of a W-random graph. We establish the minimax estimation rates with respect to the cut metric for classes of block constant matrices and step function graphons. Surprisingly, our results imply that, from the minimax point of view, the raw data, that is, the adjacency matrix of the observed graph, is already optimal and more involved procedures cannot improve the convergence rates for this metric. This phenomenon contrasts with optimal rates of convergence with respect to other classical distances for graphons such as the $l_1$ or $l_2$ metrics.

arXiv.org e-Print Archive

HAL Descartes

HAL-Polytechnique